library(bayesrules)
library(rstanarm)
## Loading required package: Rcpp
## This is rstanarm version 2.21.1
## - See https://mc-stan.org/rstanarm/articles/priors for changes to default priors!
## - Default priors may change, so it's safest to specify priors, even if equivalent to the defaults.
## - For execution on a local, multicore CPU with excess RAM we recommend calling
##   options(mc.cores = parallel::detectCores())
library(bayesplot)
## This is bayesplot version 1.8.1
## - Online documentation and vignettes at mc-stan.org/bayesplot
## - bayesplot theme set to bayesplot::theme_default()
##    * Does _not_ affect other ggplot2 plots
##    * See ?bayesplot_theme_set for details on theme setting
library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.1 ──
## ✓ ggplot2 3.3.5     ✓ purrr   0.3.4
## ✓ tibble  3.1.4     ✓ dplyr   1.0.7
## ✓ tidyr   1.1.3     ✓ stringr 1.4.0
## ✓ readr   2.0.1     ✓ forcats 0.5.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()
library(broom.mixed)
library(tidybayes)

Exercise 11.1

If we find a relationship between two independent variables and the dependent variable (and those two independent variables are independent of eachother), then we might want to build a model that incorporates both of their influences’ on the outcome variable.

Exercise 11.2

  1. Ford is the reference category

  2. The difference in a Ford car’s miles per gallon and a Subaru car’s miles per gallon.

  3. The typical miles per gallon of a Ford car.

Exercise 11.3

  1. B0 represents the typical size of a Mr Stripey tomato at 0 days of growth; B1 represents the amount a tomato increases in weight for 1 more day of growth; B2 represents the typical difference in weights between a Mr Stripey tomato and a Roma tomato at any day of age.

  2. If B2 were 0 there would be no difference in weight between Roma and Mr Stripey tomatoes.

Exercise 11.4

  1. The relationship between tomato size and age varies depending on the type of tomato.

  2. B3 represents the differences in the relationship between age and weight for the 2 different types of tomatoes (the different slopes that each of these relationships possess).

Exercise 11.5

11.5

Exercise 11.6

  1. By adding more predictors, we can improve the predictive accuracy of our posterior model.

  2. By removing predictors, we can better isolate the relationship between two variables.

  3. Height. Height has been found to correlate with foot size.

  4. Whether the child knows how to swim. I think the relationship between swimming ability and shoe size would be spurious, and would uncessarily complicate the model.

Exercise 11.7

  1. A good model produces a posterior distribution that closely matches the observed distribution; has a low MAE scaled; has a high percentage of values that fall within the within_50 interval.

  2. A bad model has a posterior distribution that deviates strongly from the shape of the observed distribution (perhaps the wrong type of model was chosen); has a high MAE scaled; has a low percentage of values that fall within the within_50 and within_95 interval.

Exercise 11.8

  1. visualization: use pp_check() to compare shape of observed and predicted distributions; use pp_intervals() to visually assess how the observed values compare to the posterior predicted intervals
  2. cross-validation: break the data set into k>=10 different folds; see how the model performs on each of these different subsets of the main dataset through observing the MAE, MAE scaled values, within_50, and within_95 for each fold (the methodology of the folds is that the first iteration trains on the first 9 subsets and tests on the 10th subset; the second iteration trains on the 2-10 and tests on the 1st; the third iteration trains on 1 and 3-10 and tests on the 2nd, etc…).
  3. ELPD: the larger the expected logged posterior predictive pdf across a new set of data points, the more accurate the posterior predictions of y. Calculate the ELPD for each model, and then compare them by looking at the absolute differences and the standard error differences.

Exercise 11.9

We want to include enough predictors that our model is accurate (i.e. closely fit to our data), but not so many predictors that our model is “overfit” (i.e. biased). Therefore, when considering how many predictor variables to include, we entertain the bias-variance tradeoff, considering the fact that we don’t want our model to be biased (overfit) nor do we want it to have too much variance (want it to be an accurate predictor).

Exercise 11.10

  1. Plotting penguin data
data("penguins_bayes")
# Alternative penguin data
penguin_data <- penguins_bayes |>
  filter(species %in% c("Adelie", "Gentoo")) |> drop_na()
ggplot(penguin_data, aes(x = flipper_length_mm, y = body_mass_g, color=species)) +
 geom_point() + 
  geom_smooth(method = "lm", se = FALSE)
## `geom_smooth()` using formula 'y ~ x'

  1. Building the model
penguin_model_1 <- stan_glm(
  body_mass_g ~ flipper_length_mm + species,
  data = penguin_data, family = gaussian, 
  prior_intercept = normal(3100, 50),
  prior = normal(33, 2.5, autoscale = TRUE), 
  prior_aux = exponential(0.001, autoscale = TRUE),
  chains = 4, iter = 4000*2, seed = 84735, verbose=FALSE)
## 
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 1).
## Chain 1: 
## Chain 1: Gradient evaluation took 0.000104 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 1.04 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1: 
## Chain 1: 
## Chain 1: Iteration:    1 / 8000 [  0%]  (Warmup)
## Chain 1: Iteration:  800 / 8000 [ 10%]  (Warmup)
## Chain 1: Iteration: 1600 / 8000 [ 20%]  (Warmup)
## Chain 1: Iteration: 2400 / 8000 [ 30%]  (Warmup)
## Chain 1: Iteration: 3200 / 8000 [ 40%]  (Warmup)
## Chain 1: Iteration: 4000 / 8000 [ 50%]  (Warmup)
## Chain 1: Iteration: 4001 / 8000 [ 50%]  (Sampling)
## Chain 1: Iteration: 4800 / 8000 [ 60%]  (Sampling)
## Chain 1: Iteration: 5600 / 8000 [ 70%]  (Sampling)
## Chain 1: Iteration: 6400 / 8000 [ 80%]  (Sampling)
## Chain 1: Iteration: 7200 / 8000 [ 90%]  (Sampling)
## Chain 1: Iteration: 8000 / 8000 [100%]  (Sampling)
## Chain 1: 
## Chain 1:  Elapsed Time: 0.501541 seconds (Warm-up)
## Chain 1:                0.549886 seconds (Sampling)
## Chain 1:                1.05143 seconds (Total)
## Chain 1: 
## 
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 2).
## Chain 2: 
## Chain 2: Gradient evaluation took 2.3e-05 seconds
## Chain 2: 1000 transitions using 10 leapfrog steps per transition would take 0.23 seconds.
## Chain 2: Adjust your expectations accordingly!
## Chain 2: 
## Chain 2: 
## Chain 2: Iteration:    1 / 8000 [  0%]  (Warmup)
## Chain 2: Iteration:  800 / 8000 [ 10%]  (Warmup)
## Chain 2: Iteration: 1600 / 8000 [ 20%]  (Warmup)
## Chain 2: Iteration: 2400 / 8000 [ 30%]  (Warmup)
## Chain 2: Iteration: 3200 / 8000 [ 40%]  (Warmup)
## Chain 2: Iteration: 4000 / 8000 [ 50%]  (Warmup)
## Chain 2: Iteration: 4001 / 8000 [ 50%]  (Sampling)
## Chain 2: Iteration: 4800 / 8000 [ 60%]  (Sampling)
## Chain 2: Iteration: 5600 / 8000 [ 70%]  (Sampling)
## Chain 2: Iteration: 6400 / 8000 [ 80%]  (Sampling)
## Chain 2: Iteration: 7200 / 8000 [ 90%]  (Sampling)
## Chain 2: Iteration: 8000 / 8000 [100%]  (Sampling)
## Chain 2: 
## Chain 2:  Elapsed Time: 0.347482 seconds (Warm-up)
## Chain 2:                0.462969 seconds (Sampling)
## Chain 2:                0.810451 seconds (Total)
## Chain 2: 
## 
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 3).
## Chain 3: 
## Chain 3: Gradient evaluation took 1.9e-05 seconds
## Chain 3: 1000 transitions using 10 leapfrog steps per transition would take 0.19 seconds.
## Chain 3: Adjust your expectations accordingly!
## Chain 3: 
## Chain 3: 
## Chain 3: Iteration:    1 / 8000 [  0%]  (Warmup)
## Chain 3: Iteration:  800 / 8000 [ 10%]  (Warmup)
## Chain 3: Iteration: 1600 / 8000 [ 20%]  (Warmup)
## Chain 3: Iteration: 2400 / 8000 [ 30%]  (Warmup)
## Chain 3: Iteration: 3200 / 8000 [ 40%]  (Warmup)
## Chain 3: Iteration: 4000 / 8000 [ 50%]  (Warmup)
## Chain 3: Iteration: 4001 / 8000 [ 50%]  (Sampling)
## Chain 3: Iteration: 4800 / 8000 [ 60%]  (Sampling)
## Chain 3: Iteration: 5600 / 8000 [ 70%]  (Sampling)
## Chain 3: Iteration: 6400 / 8000 [ 80%]  (Sampling)
## Chain 3: Iteration: 7200 / 8000 [ 90%]  (Sampling)
## Chain 3: Iteration: 8000 / 8000 [100%]  (Sampling)
## Chain 3: 
## Chain 3:  Elapsed Time: 0.424767 seconds (Warm-up)
## Chain 3:                0.569065 seconds (Sampling)
## Chain 3:                0.993832 seconds (Total)
## Chain 3: 
## 
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 4).
## Chain 4: 
## Chain 4: Gradient evaluation took 1.8e-05 seconds
## Chain 4: 1000 transitions using 10 leapfrog steps per transition would take 0.18 seconds.
## Chain 4: Adjust your expectations accordingly!
## Chain 4: 
## Chain 4: 
## Chain 4: Iteration:    1 / 8000 [  0%]  (Warmup)
## Chain 4: Iteration:  800 / 8000 [ 10%]  (Warmup)
## Chain 4: Iteration: 1600 / 8000 [ 20%]  (Warmup)
## Chain 4: Iteration: 2400 / 8000 [ 30%]  (Warmup)
## Chain 4: Iteration: 3200 / 8000 [ 40%]  (Warmup)
## Chain 4: Iteration: 4000 / 8000 [ 50%]  (Warmup)
## Chain 4: Iteration: 4001 / 8000 [ 50%]  (Sampling)
## Chain 4: Iteration: 4800 / 8000 [ 60%]  (Sampling)
## Chain 4: Iteration: 5600 / 8000 [ 70%]  (Sampling)
## Chain 4: Iteration: 6400 / 8000 [ 80%]  (Sampling)
## Chain 4: Iteration: 7200 / 8000 [ 90%]  (Sampling)
## Chain 4: Iteration: 8000 / 8000 [100%]  (Sampling)
## Chain 4: 
## Chain 4:  Elapsed Time: 0.372732 seconds (Warm-up)
## Chain 4:                0.468452 seconds (Sampling)
## Chain 4:                0.841184 seconds (Total)
## Chain 4:
  1. Checking out some visual and numerical diagnostics of the model
pp_check(penguin_model_1)

Check out some draws from our model

penguin_data %>%
  add_fitted_draws(penguin_model_1, n = 50) %>%
  ggplot(aes(x = flipper_length_mm, y = body_mass_g, color = species)) +
    geom_line(aes(y = .value, group = paste(species, .draw)), alpha = 0.1)
## Warning: `fitted_draws` and `add_fitted_draws` are deprecated as their names were confusing.
## Use [add_]epred_draws() to get the expectation of the posterior predictive.
## Use [add_]linpred_draws() to get the distribution of the linear predictor.
## For example, you used [add_]fitted_draws(..., scale = "response"), which
## means you most likely want [add_]epred_draws(...).

set.seed(84735)
predictions_1 <- posterior_predict(penguin_model_1, newdata = penguin_data)


ppc_intervals_grouped(penguin_data$body_mass_g, yrep = predictions_1, 
                      x = penguin_data$flipper_length_mm, group = penguin_data$species,
                      prob = 0.5, prob_outer = 0.95,
                      facet_args = list(scales = "fixed")) + 
  labs(x = "flipper_length_mm", y = "body_mass_g")

From both of these visual diagnostics, we can see that our model follows the shape of the distribution pretty well, and that the majority of our predicted values fall within at least the 95% posterior prediction interval.

Some numerical summaries

prediction_summary_cv(model = penguin_model_1, data = penguin_data, k = 10)
## $folds
##    fold      mae mae_scaled within_50 within_95
## 1     1 427.1594  0.6729862 0.4814815 1.0000000
## 2     2 537.3611  0.8637755 0.4230769 0.9230769
## 3     3 536.7048  0.8241943 0.4074074 0.8888889
## 4     4 619.8548  0.9002595 0.1923077 0.9615385
## 5     5 356.5858  0.5338050 0.5555556 0.9259259
## 6     6 405.2069  0.6354953 0.5000000 1.0000000
## 7     7 405.6854  0.6270410 0.5384615 0.9615385
## 8     8 504.9042  0.7843185 0.4074074 1.0000000
## 9     9 357.7319  0.6067848 0.5769231 0.9230769
## 10   10 579.3295  0.8830854 0.3333333 1.0000000
## 
## $cv
##        mae mae_scaled within_50 within_95
## 1 473.0524  0.7331745 0.4415954 0.9584046

The numerical statistics confirm what the visualizations demonstrated–most (95%) of our data are within the 95% prediction interval, and close to half are within the 50% prediction interval.

  1. Creating a tidy() summary of the model
tidy(penguin_model_1, effects = c("fixed", "aux"),
     conf.int = TRUE, conf.level = 0.80)

The flipper_length_mm coefficient refers to the amount of increase in body_mass caused by a one unit change in flipper_length for Adelie penguins. The speciesGentoo coefficient refers to the difference in body_mass_g for a given flipper_length between the Adelie and Gentoo species.

  1. Simulating and plotting posterior model for an Adelie penguin with flipper length 197.
# Simulate a set of predictions
set.seed(84735)
prediction <- posterior_predict(
  penguin_model_1,
  newdata = data.frame(flipper_length_mm= 197, 
                       species = "Adelie"))

# Plot the posterior predictive models
mcmc_areas(prediction) +
  xlab("body_mass_g")

The body_mass of an Adelie penguin with flipper length 197 is around 3700, with likely values ranging from 3250 to 4000.

Exercise 11.11

  1. Modeling body mass by flipper length and species including an interaction term.
penguin_model_2 <- stan_glm(
  body_mass_g ~ flipper_length_mm + species + flipper_length_mm:species,
  data = penguin_data, family = gaussian, 
  prior_intercept = normal(3100, 50),
  prior = normal(33, 2.5, autoscale = TRUE), 
  prior_aux = exponential(0.001, autoscale = TRUE),
  chains = 4, iter = 4000*2, seed = 84735, verbose=FALSE)
## 
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 1).
## Chain 1: 
## Chain 1: Gradient evaluation took 2.4e-05 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0.24 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1: 
## Chain 1: 
## Chain 1: Iteration:    1 / 8000 [  0%]  (Warmup)
## Chain 1: Iteration:  800 / 8000 [ 10%]  (Warmup)
## Chain 1: Iteration: 1600 / 8000 [ 20%]  (Warmup)
## Chain 1: Iteration: 2400 / 8000 [ 30%]  (Warmup)
## Chain 1: Iteration: 3200 / 8000 [ 40%]  (Warmup)
## Chain 1: Iteration: 4000 / 8000 [ 50%]  (Warmup)
## Chain 1: Iteration: 4001 / 8000 [ 50%]  (Sampling)
## Chain 1: Iteration: 4800 / 8000 [ 60%]  (Sampling)
## Chain 1: Iteration: 5600 / 8000 [ 70%]  (Sampling)
## Chain 1: Iteration: 6400 / 8000 [ 80%]  (Sampling)
## Chain 1: Iteration: 7200 / 8000 [ 90%]  (Sampling)
## Chain 1: Iteration: 8000 / 8000 [100%]  (Sampling)
## Chain 1: 
## Chain 1:  Elapsed Time: 2.94285 seconds (Warm-up)
## Chain 1:                3.77951 seconds (Sampling)
## Chain 1:                6.72236 seconds (Total)
## Chain 1: 
## 
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 2).
## Chain 2: 
## Chain 2: Gradient evaluation took 1.7e-05 seconds
## Chain 2: 1000 transitions using 10 leapfrog steps per transition would take 0.17 seconds.
## Chain 2: Adjust your expectations accordingly!
## Chain 2: 
## Chain 2: 
## Chain 2: Iteration:    1 / 8000 [  0%]  (Warmup)
## Chain 2: Iteration:  800 / 8000 [ 10%]  (Warmup)
## Chain 2: Iteration: 1600 / 8000 [ 20%]  (Warmup)
## Chain 2: Iteration: 2400 / 8000 [ 30%]  (Warmup)
## Chain 2: Iteration: 3200 / 8000 [ 40%]  (Warmup)
## Chain 2: Iteration: 4000 / 8000 [ 50%]  (Warmup)
## Chain 2: Iteration: 4001 / 8000 [ 50%]  (Sampling)
## Chain 2: Iteration: 4800 / 8000 [ 60%]  (Sampling)
## Chain 2: Iteration: 5600 / 8000 [ 70%]  (Sampling)
## Chain 2: Iteration: 6400 / 8000 [ 80%]  (Sampling)
## Chain 2: Iteration: 7200 / 8000 [ 90%]  (Sampling)
## Chain 2: Iteration: 8000 / 8000 [100%]  (Sampling)
## Chain 2: 
## Chain 2:  Elapsed Time: 2.89972 seconds (Warm-up)
## Chain 2:                3.40714 seconds (Sampling)
## Chain 2:                6.30686 seconds (Total)
## Chain 2: 
## 
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 3).
## Chain 3: 
## Chain 3: Gradient evaluation took 2.2e-05 seconds
## Chain 3: 1000 transitions using 10 leapfrog steps per transition would take 0.22 seconds.
## Chain 3: Adjust your expectations accordingly!
## Chain 3: 
## Chain 3: 
## Chain 3: Iteration:    1 / 8000 [  0%]  (Warmup)
## Chain 3: Iteration:  800 / 8000 [ 10%]  (Warmup)
## Chain 3: Iteration: 1600 / 8000 [ 20%]  (Warmup)
## Chain 3: Iteration: 2400 / 8000 [ 30%]  (Warmup)
## Chain 3: Iteration: 3200 / 8000 [ 40%]  (Warmup)
## Chain 3: Iteration: 4000 / 8000 [ 50%]  (Warmup)
## Chain 3: Iteration: 4001 / 8000 [ 50%]  (Sampling)
## Chain 3: Iteration: 4800 / 8000 [ 60%]  (Sampling)
## Chain 3: Iteration: 5600 / 8000 [ 70%]  (Sampling)
## Chain 3: Iteration: 6400 / 8000 [ 80%]  (Sampling)
## Chain 3: Iteration: 7200 / 8000 [ 90%]  (Sampling)
## Chain 3: Iteration: 8000 / 8000 [100%]  (Sampling)
## Chain 3: 
## Chain 3:  Elapsed Time: 3.00008 seconds (Warm-up)
## Chain 3:                3.60545 seconds (Sampling)
## Chain 3:                6.60553 seconds (Total)
## Chain 3: 
## 
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 4).
## Chain 4: 
## Chain 4: Gradient evaluation took 1.8e-05 seconds
## Chain 4: 1000 transitions using 10 leapfrog steps per transition would take 0.18 seconds.
## Chain 4: Adjust your expectations accordingly!
## Chain 4: 
## Chain 4: 
## Chain 4: Iteration:    1 / 8000 [  0%]  (Warmup)
## Chain 4: Iteration:  800 / 8000 [ 10%]  (Warmup)
## Chain 4: Iteration: 1600 / 8000 [ 20%]  (Warmup)
## Chain 4: Iteration: 2400 / 8000 [ 30%]  (Warmup)
## Chain 4: Iteration: 3200 / 8000 [ 40%]  (Warmup)
## Chain 4: Iteration: 4000 / 8000 [ 50%]  (Warmup)
## Chain 4: Iteration: 4001 / 8000 [ 50%]  (Sampling)
## Chain 4: Iteration: 4800 / 8000 [ 60%]  (Sampling)
## Chain 4: Iteration: 5600 / 8000 [ 70%]  (Sampling)
## Chain 4: Iteration: 6400 / 8000 [ 80%]  (Sampling)
## Chain 4: Iteration: 7200 / 8000 [ 90%]  (Sampling)
## Chain 4: Iteration: 8000 / 8000 [100%]  (Sampling)
## Chain 4: 
## Chain 4:  Elapsed Time: 2.90421 seconds (Warm-up)
## Chain 4:                3.26702 seconds (Sampling)
## Chain 4:                6.17123 seconds (Total)
## Chain 4:
  1. Simulating and plotting 50 posterior lines
penguin_data %>%
  add_fitted_draws(penguin_model_2, n = 50) %>%
  ggplot(aes(x = flipper_length_mm, y = body_mass_g, color = species)) +
    geom_line(aes(y = .value, group = paste(species, .draw)), alpha = 0.1)
## Warning: `fitted_draws` and `add_fitted_draws` are deprecated as their names were confusing.
## Use [add_]epred_draws() to get the expectation of the posterior predictive.
## Use [add_]linpred_draws() to get the distribution of the linear predictor.
## For example, you used [add_]fitted_draws(..., scale = "response"), which
## means you most likely want [add_]epred_draws(...).

The slope of the line is slightly less steep than the model without the interaction term for both Adelie and Gentoo penguins. This implies that the interaction “dampens” the strength of the relationship between flipper_length and body_mass for both species. We can estimate the overall slope (including the relationship between flipper_length_mm and body_mass_g and the relationship between the interaction between flipper_length_mm/species and body_mass_g) as around 33 for Adelie penguins, and 50 for Gentoo penguins.

  1. Using tidy() summary
tidy(penguin_model_2, effects = c("fixed", "aux"),
     conf.int = TRUE, conf.level = 0.80)

The coefficient for the speciesGentoo term changed significantly with the interaction term included in the regression model (from 234 to -3899) despite the fact that we used the same priors (so the only difference between the two models was the interaction term). Therefore, the interaction must have a strong effect, for when we allow the interaction to be present, the species coefficient changes in value greatly. (CHECK THIS LOGIC )

Exercise 11.12

  1. Simulating model with 3 predictors
penguin_model_3 <- stan_glm(
  body_mass_g ~ flipper_length_mm + bill_length_mm + bill_depth_mm,
  data = penguin_data, family = gaussian, 
  prior_intercept = normal(3100, 50),
  prior = normal(33, 2.5, autoscale = TRUE), 
  prior_aux = exponential(0.001, autoscale = TRUE),
  chains = 4, iter = 4000*2, seed = 84735, verbose=FALSE)
## 
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 1).
## Chain 1: 
## Chain 1: Gradient evaluation took 2.6e-05 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0.26 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1: 
## Chain 1: 
## Chain 1: Iteration:    1 / 8000 [  0%]  (Warmup)
## Chain 1: Iteration:  800 / 8000 [ 10%]  (Warmup)
## Chain 1: Iteration: 1600 / 8000 [ 20%]  (Warmup)
## Chain 1: Iteration: 2400 / 8000 [ 30%]  (Warmup)
## Chain 1: Iteration: 3200 / 8000 [ 40%]  (Warmup)
## Chain 1: Iteration: 4000 / 8000 [ 50%]  (Warmup)
## Chain 1: Iteration: 4001 / 8000 [ 50%]  (Sampling)
## Chain 1: Iteration: 4800 / 8000 [ 60%]  (Sampling)
## Chain 1: Iteration: 5600 / 8000 [ 70%]  (Sampling)
## Chain 1: Iteration: 6400 / 8000 [ 80%]  (Sampling)
## Chain 1: Iteration: 7200 / 8000 [ 90%]  (Sampling)
## Chain 1: Iteration: 8000 / 8000 [100%]  (Sampling)
## Chain 1: 
## Chain 1:  Elapsed Time: 0.366585 seconds (Warm-up)
## Chain 1:                0.478134 seconds (Sampling)
## Chain 1:                0.844719 seconds (Total)
## Chain 1: 
## 
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 2).
## Chain 2: 
## Chain 2: Gradient evaluation took 1.7e-05 seconds
## Chain 2: 1000 transitions using 10 leapfrog steps per transition would take 0.17 seconds.
## Chain 2: Adjust your expectations accordingly!
## Chain 2: 
## Chain 2: 
## Chain 2: Iteration:    1 / 8000 [  0%]  (Warmup)
## Chain 2: Iteration:  800 / 8000 [ 10%]  (Warmup)
## Chain 2: Iteration: 1600 / 8000 [ 20%]  (Warmup)
## Chain 2: Iteration: 2400 / 8000 [ 30%]  (Warmup)
## Chain 2: Iteration: 3200 / 8000 [ 40%]  (Warmup)
## Chain 2: Iteration: 4000 / 8000 [ 50%]  (Warmup)
## Chain 2: Iteration: 4001 / 8000 [ 50%]  (Sampling)
## Chain 2: Iteration: 4800 / 8000 [ 60%]  (Sampling)
## Chain 2: Iteration: 5600 / 8000 [ 70%]  (Sampling)
## Chain 2: Iteration: 6400 / 8000 [ 80%]  (Sampling)
## Chain 2: Iteration: 7200 / 8000 [ 90%]  (Sampling)
## Chain 2: Iteration: 8000 / 8000 [100%]  (Sampling)
## Chain 2: 
## Chain 2:  Elapsed Time: 0.381986 seconds (Warm-up)
## Chain 2:                0.466469 seconds (Sampling)
## Chain 2:                0.848455 seconds (Total)
## Chain 2: 
## 
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 3).
## Chain 3: 
## Chain 3: Gradient evaluation took 1.8e-05 seconds
## Chain 3: 1000 transitions using 10 leapfrog steps per transition would take 0.18 seconds.
## Chain 3: Adjust your expectations accordingly!
## Chain 3: 
## Chain 3: 
## Chain 3: Iteration:    1 / 8000 [  0%]  (Warmup)
## Chain 3: Iteration:  800 / 8000 [ 10%]  (Warmup)
## Chain 3: Iteration: 1600 / 8000 [ 20%]  (Warmup)
## Chain 3: Iteration: 2400 / 8000 [ 30%]  (Warmup)
## Chain 3: Iteration: 3200 / 8000 [ 40%]  (Warmup)
## Chain 3: Iteration: 4000 / 8000 [ 50%]  (Warmup)
## Chain 3: Iteration: 4001 / 8000 [ 50%]  (Sampling)
## Chain 3: Iteration: 4800 / 8000 [ 60%]  (Sampling)
## Chain 3: Iteration: 5600 / 8000 [ 70%]  (Sampling)
## Chain 3: Iteration: 6400 / 8000 [ 80%]  (Sampling)
## Chain 3: Iteration: 7200 / 8000 [ 90%]  (Sampling)
## Chain 3: Iteration: 8000 / 8000 [100%]  (Sampling)
## Chain 3: 
## Chain 3:  Elapsed Time: 0.376459 seconds (Warm-up)
## Chain 3:                0.505972 seconds (Sampling)
## Chain 3:                0.882431 seconds (Total)
## Chain 3: 
## 
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 4).
## Chain 4: 
## Chain 4: Gradient evaluation took 2.1e-05 seconds
## Chain 4: 1000 transitions using 10 leapfrog steps per transition would take 0.21 seconds.
## Chain 4: Adjust your expectations accordingly!
## Chain 4: 
## Chain 4: 
## Chain 4: Iteration:    1 / 8000 [  0%]  (Warmup)
## Chain 4: Iteration:  800 / 8000 [ 10%]  (Warmup)
## Chain 4: Iteration: 1600 / 8000 [ 20%]  (Warmup)
## Chain 4: Iteration: 2400 / 8000 [ 30%]  (Warmup)
## Chain 4: Iteration: 3200 / 8000 [ 40%]  (Warmup)
## Chain 4: Iteration: 4000 / 8000 [ 50%]  (Warmup)
## Chain 4: Iteration: 4001 / 8000 [ 50%]  (Sampling)
## Chain 4: Iteration: 4800 / 8000 [ 60%]  (Sampling)
## Chain 4: Iteration: 5600 / 8000 [ 70%]  (Sampling)
## Chain 4: Iteration: 6400 / 8000 [ 80%]  (Sampling)
## Chain 4: Iteration: 7200 / 8000 [ 90%]  (Sampling)
## Chain 4: Iteration: 8000 / 8000 [100%]  (Sampling)
## Chain 4: 
## Chain 4:  Elapsed Time: 0.367853 seconds (Warm-up)
## Chain 4:                0.461537 seconds (Sampling)
## Chain 4:                0.82939 seconds (Total)
## Chain 4:
  1. Using posterior_interval() to produce 95% credible intervals for the model parameters
posterior_interval(penguin_model_3, prob=0.8)
##                           10%         90%
## (Intercept)       -7378.32923 -5480.31256
## flipper_length_mm    27.45303    37.79006
## bill_length_mm       56.63118    83.84360
## bill_depth_mm        29.00648    74.40261
## sigma               406.01564   490.09006
  1. All variables have a positive association with body mass.

Exercise 11.13

  1. Simulating the 4 models:
penguin_model_4 <- stan_glm(
  body_mass_g ~ flipper_length_mm,
  data = penguin_data, family = gaussian, 
  prior_intercept = normal(3100, 50),
  prior = normal(30, 2.5, autoscale = TRUE), 
  prior_aux = exponential(0.001, autoscale = TRUE),
  chains = 4, iter = 4000*2, seed = 84735, verbose=FALSE)
## 
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 1).
## Chain 1: 
## Chain 1: Gradient evaluation took 3.2e-05 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0.32 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1: 
## Chain 1: 
## Chain 1: Iteration:    1 / 8000 [  0%]  (Warmup)
## Chain 1: Iteration:  800 / 8000 [ 10%]  (Warmup)
## Chain 1: Iteration: 1600 / 8000 [ 20%]  (Warmup)
## Chain 1: Iteration: 2400 / 8000 [ 30%]  (Warmup)
## Chain 1: Iteration: 3200 / 8000 [ 40%]  (Warmup)
## Chain 1: Iteration: 4000 / 8000 [ 50%]  (Warmup)
## Chain 1: Iteration: 4001 / 8000 [ 50%]  (Sampling)
## Chain 1: Iteration: 4800 / 8000 [ 60%]  (Sampling)
## Chain 1: Iteration: 5600 / 8000 [ 70%]  (Sampling)
## Chain 1: Iteration: 6400 / 8000 [ 80%]  (Sampling)
## Chain 1: Iteration: 7200 / 8000 [ 90%]  (Sampling)
## Chain 1: Iteration: 8000 / 8000 [100%]  (Sampling)
## Chain 1: 
## Chain 1:  Elapsed Time: 0.304403 seconds (Warm-up)
## Chain 1:                0.411456 seconds (Sampling)
## Chain 1:                0.715859 seconds (Total)
## Chain 1: 
## 
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 2).
## Chain 2: 
## Chain 2: Gradient evaluation took 2.3e-05 seconds
## Chain 2: 1000 transitions using 10 leapfrog steps per transition would take 0.23 seconds.
## Chain 2: Adjust your expectations accordingly!
## Chain 2: 
## Chain 2: 
## Chain 2: Iteration:    1 / 8000 [  0%]  (Warmup)
## Chain 2: Iteration:  800 / 8000 [ 10%]  (Warmup)
## Chain 2: Iteration: 1600 / 8000 [ 20%]  (Warmup)
## Chain 2: Iteration: 2400 / 8000 [ 30%]  (Warmup)
## Chain 2: Iteration: 3200 / 8000 [ 40%]  (Warmup)
## Chain 2: Iteration: 4000 / 8000 [ 50%]  (Warmup)
## Chain 2: Iteration: 4001 / 8000 [ 50%]  (Sampling)
## Chain 2: Iteration: 4800 / 8000 [ 60%]  (Sampling)
## Chain 2: Iteration: 5600 / 8000 [ 70%]  (Sampling)
## Chain 2: Iteration: 6400 / 8000 [ 80%]  (Sampling)
## Chain 2: Iteration: 7200 / 8000 [ 90%]  (Sampling)
## Chain 2: Iteration: 8000 / 8000 [100%]  (Sampling)
## Chain 2: 
## Chain 2:  Elapsed Time: 0.247148 seconds (Warm-up)
## Chain 2:                0.369469 seconds (Sampling)
## Chain 2:                0.616617 seconds (Total)
## Chain 2: 
## 
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 3).
## Chain 3: 
## Chain 3: Gradient evaluation took 1.7e-05 seconds
## Chain 3: 1000 transitions using 10 leapfrog steps per transition would take 0.17 seconds.
## Chain 3: Adjust your expectations accordingly!
## Chain 3: 
## Chain 3: 
## Chain 3: Iteration:    1 / 8000 [  0%]  (Warmup)
## Chain 3: Iteration:  800 / 8000 [ 10%]  (Warmup)
## Chain 3: Iteration: 1600 / 8000 [ 20%]  (Warmup)
## Chain 3: Iteration: 2400 / 8000 [ 30%]  (Warmup)
## Chain 3: Iteration: 3200 / 8000 [ 40%]  (Warmup)
## Chain 3: Iteration: 4000 / 8000 [ 50%]  (Warmup)
## Chain 3: Iteration: 4001 / 8000 [ 50%]  (Sampling)
## Chain 3: Iteration: 4800 / 8000 [ 60%]  (Sampling)
## Chain 3: Iteration: 5600 / 8000 [ 70%]  (Sampling)
## Chain 3: Iteration: 6400 / 8000 [ 80%]  (Sampling)
## Chain 3: Iteration: 7200 / 8000 [ 90%]  (Sampling)
## Chain 3: Iteration: 8000 / 8000 [100%]  (Sampling)
## Chain 3: 
## Chain 3:  Elapsed Time: 0.244583 seconds (Warm-up)
## Chain 3:                0.370159 seconds (Sampling)
## Chain 3:                0.614742 seconds (Total)
## Chain 3: 
## 
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 4).
## Chain 4: 
## Chain 4: Gradient evaluation took 1.7e-05 seconds
## Chain 4: 1000 transitions using 10 leapfrog steps per transition would take 0.17 seconds.
## Chain 4: Adjust your expectations accordingly!
## Chain 4: 
## Chain 4: 
## Chain 4: Iteration:    1 / 8000 [  0%]  (Warmup)
## Chain 4: Iteration:  800 / 8000 [ 10%]  (Warmup)
## Chain 4: Iteration: 1600 / 8000 [ 20%]  (Warmup)
## Chain 4: Iteration: 2400 / 8000 [ 30%]  (Warmup)
## Chain 4: Iteration: 3200 / 8000 [ 40%]  (Warmup)
## Chain 4: Iteration: 4000 / 8000 [ 50%]  (Warmup)
## Chain 4: Iteration: 4001 / 8000 [ 50%]  (Sampling)
## Chain 4: Iteration: 4800 / 8000 [ 60%]  (Sampling)
## Chain 4: Iteration: 5600 / 8000 [ 70%]  (Sampling)
## Chain 4: Iteration: 6400 / 8000 [ 80%]  (Sampling)
## Chain 4: Iteration: 7200 / 8000 [ 90%]  (Sampling)
## Chain 4: Iteration: 8000 / 8000 [100%]  (Sampling)
## Chain 4: 
## Chain 4:  Elapsed Time: 0.239413 seconds (Warm-up)
## Chain 4:                0.351681 seconds (Sampling)
## Chain 4:                0.591094 seconds (Total)
## Chain 4:
penguin_model_5 <- stan_glm(
  body_mass_g ~ species,
  data = penguin_data, family = gaussian, 
  prior_intercept = normal(3100, 50),
  prior = normal(30, 2.5, autoscale = TRUE), 
  prior_aux = exponential(0.001, autoscale = TRUE),
  chains = 4, iter = 4000*2, seed = 84735, verbose=FALSE)
## 
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 1).
## Chain 1: 
## Chain 1: Gradient evaluation took 2.1e-05 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0.21 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1: 
## Chain 1: 
## Chain 1: Iteration:    1 / 8000 [  0%]  (Warmup)
## Chain 1: Iteration:  800 / 8000 [ 10%]  (Warmup)
## Chain 1: Iteration: 1600 / 8000 [ 20%]  (Warmup)
## Chain 1: Iteration: 2400 / 8000 [ 30%]  (Warmup)
## Chain 1: Iteration: 3200 / 8000 [ 40%]  (Warmup)
## Chain 1: Iteration: 4000 / 8000 [ 50%]  (Warmup)
## Chain 1: Iteration: 4001 / 8000 [ 50%]  (Sampling)
## Chain 1: Iteration: 4800 / 8000 [ 60%]  (Sampling)
## Chain 1: Iteration: 5600 / 8000 [ 70%]  (Sampling)
## Chain 1: Iteration: 6400 / 8000 [ 80%]  (Sampling)
## Chain 1: Iteration: 7200 / 8000 [ 90%]  (Sampling)
## Chain 1: Iteration: 8000 / 8000 [100%]  (Sampling)
## Chain 1: 
## Chain 1:  Elapsed Time: 0.243102 seconds (Warm-up)
## Chain 1:                0.333097 seconds (Sampling)
## Chain 1:                0.576199 seconds (Total)
## Chain 1: 
## 
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 2).
## Chain 2: 
## Chain 2: Gradient evaluation took 1.8e-05 seconds
## Chain 2: 1000 transitions using 10 leapfrog steps per transition would take 0.18 seconds.
## Chain 2: Adjust your expectations accordingly!
## Chain 2: 
## Chain 2: 
## Chain 2: Iteration:    1 / 8000 [  0%]  (Warmup)
## Chain 2: Iteration:  800 / 8000 [ 10%]  (Warmup)
## Chain 2: Iteration: 1600 / 8000 [ 20%]  (Warmup)
## Chain 2: Iteration: 2400 / 8000 [ 30%]  (Warmup)
## Chain 2: Iteration: 3200 / 8000 [ 40%]  (Warmup)
## Chain 2: Iteration: 4000 / 8000 [ 50%]  (Warmup)
## Chain 2: Iteration: 4001 / 8000 [ 50%]  (Sampling)
## Chain 2: Iteration: 4800 / 8000 [ 60%]  (Sampling)
## Chain 2: Iteration: 5600 / 8000 [ 70%]  (Sampling)
## Chain 2: Iteration: 6400 / 8000 [ 80%]  (Sampling)
## Chain 2: Iteration: 7200 / 8000 [ 90%]  (Sampling)
## Chain 2: Iteration: 8000 / 8000 [100%]  (Sampling)
## Chain 2: 
## Chain 2:  Elapsed Time: 0.235263 seconds (Warm-up)
## Chain 2:                0.335858 seconds (Sampling)
## Chain 2:                0.571121 seconds (Total)
## Chain 2: 
## 
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 3).
## Chain 3: 
## Chain 3: Gradient evaluation took 1.6e-05 seconds
## Chain 3: 1000 transitions using 10 leapfrog steps per transition would take 0.16 seconds.
## Chain 3: Adjust your expectations accordingly!
## Chain 3: 
## Chain 3: 
## Chain 3: Iteration:    1 / 8000 [  0%]  (Warmup)
## Chain 3: Iteration:  800 / 8000 [ 10%]  (Warmup)
## Chain 3: Iteration: 1600 / 8000 [ 20%]  (Warmup)
## Chain 3: Iteration: 2400 / 8000 [ 30%]  (Warmup)
## Chain 3: Iteration: 3200 / 8000 [ 40%]  (Warmup)
## Chain 3: Iteration: 4000 / 8000 [ 50%]  (Warmup)
## Chain 3: Iteration: 4001 / 8000 [ 50%]  (Sampling)
## Chain 3: Iteration: 4800 / 8000 [ 60%]  (Sampling)
## Chain 3: Iteration: 5600 / 8000 [ 70%]  (Sampling)
## Chain 3: Iteration: 6400 / 8000 [ 80%]  (Sampling)
## Chain 3: Iteration: 7200 / 8000 [ 90%]  (Sampling)
## Chain 3: Iteration: 8000 / 8000 [100%]  (Sampling)
## Chain 3: 
## Chain 3:  Elapsed Time: 0.245487 seconds (Warm-up)
## Chain 3:                0.37614 seconds (Sampling)
## Chain 3:                0.621627 seconds (Total)
## Chain 3: 
## 
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 4).
## Chain 4: 
## Chain 4: Gradient evaluation took 1.6e-05 seconds
## Chain 4: 1000 transitions using 10 leapfrog steps per transition would take 0.16 seconds.
## Chain 4: Adjust your expectations accordingly!
## Chain 4: 
## Chain 4: 
## Chain 4: Iteration:    1 / 8000 [  0%]  (Warmup)
## Chain 4: Iteration:  800 / 8000 [ 10%]  (Warmup)
## Chain 4: Iteration: 1600 / 8000 [ 20%]  (Warmup)
## Chain 4: Iteration: 2400 / 8000 [ 30%]  (Warmup)
## Chain 4: Iteration: 3200 / 8000 [ 40%]  (Warmup)
## Chain 4: Iteration: 4000 / 8000 [ 50%]  (Warmup)
## Chain 4: Iteration: 4001 / 8000 [ 50%]  (Sampling)
## Chain 4: Iteration: 4800 / 8000 [ 60%]  (Sampling)
## Chain 4: Iteration: 5600 / 8000 [ 70%]  (Sampling)
## Chain 4: Iteration: 6400 / 8000 [ 80%]  (Sampling)
## Chain 4: Iteration: 7200 / 8000 [ 90%]  (Sampling)
## Chain 4: Iteration: 8000 / 8000 [100%]  (Sampling)
## Chain 4: 
## Chain 4:  Elapsed Time: 0.256648 seconds (Warm-up)
## Chain 4:                0.347057 seconds (Sampling)
## Chain 4:                0.603705 seconds (Total)
## Chain 4:
penguin_model_6 <- stan_glm(
  body_mass_g ~ flipper_length_mm +species,
  data = penguin_data, family = gaussian, 
  prior_intercept = normal(3100, 50),
  prior = normal(30, 2.5, autoscale = TRUE), 
  prior_aux = exponential(0.001, autoscale = TRUE),
  chains = 4, iter = 4000*2, seed = 84735, verbose=FALSE)
## 
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 1).
## Chain 1: 
## Chain 1: Gradient evaluation took 2.2e-05 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0.22 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1: 
## Chain 1: 
## Chain 1: Iteration:    1 / 8000 [  0%]  (Warmup)
## Chain 1: Iteration:  800 / 8000 [ 10%]  (Warmup)
## Chain 1: Iteration: 1600 / 8000 [ 20%]  (Warmup)
## Chain 1: Iteration: 2400 / 8000 [ 30%]  (Warmup)
## Chain 1: Iteration: 3200 / 8000 [ 40%]  (Warmup)
## Chain 1: Iteration: 4000 / 8000 [ 50%]  (Warmup)
## Chain 1: Iteration: 4001 / 8000 [ 50%]  (Sampling)
## Chain 1: Iteration: 4800 / 8000 [ 60%]  (Sampling)
## Chain 1: Iteration: 5600 / 8000 [ 70%]  (Sampling)
## Chain 1: Iteration: 6400 / 8000 [ 80%]  (Sampling)
## Chain 1: Iteration: 7200 / 8000 [ 90%]  (Sampling)
## Chain 1: Iteration: 8000 / 8000 [100%]  (Sampling)
## Chain 1: 
## Chain 1:  Elapsed Time: 0.412836 seconds (Warm-up)
## Chain 1:                0.379282 seconds (Sampling)
## Chain 1:                0.792118 seconds (Total)
## Chain 1: 
## 
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 2).
## Chain 2: 
## Chain 2: Gradient evaluation took 2.3e-05 seconds
## Chain 2: 1000 transitions using 10 leapfrog steps per transition would take 0.23 seconds.
## Chain 2: Adjust your expectations accordingly!
## Chain 2: 
## Chain 2: 
## Chain 2: Iteration:    1 / 8000 [  0%]  (Warmup)
## Chain 2: Iteration:  800 / 8000 [ 10%]  (Warmup)
## Chain 2: Iteration: 1600 / 8000 [ 20%]  (Warmup)
## Chain 2: Iteration: 2400 / 8000 [ 30%]  (Warmup)
## Chain 2: Iteration: 3200 / 8000 [ 40%]  (Warmup)
## Chain 2: Iteration: 4000 / 8000 [ 50%]  (Warmup)
## Chain 2: Iteration: 4001 / 8000 [ 50%]  (Sampling)
## Chain 2: Iteration: 4800 / 8000 [ 60%]  (Sampling)
## Chain 2: Iteration: 5600 / 8000 [ 70%]  (Sampling)
## Chain 2: Iteration: 6400 / 8000 [ 80%]  (Sampling)
## Chain 2: Iteration: 7200 / 8000 [ 90%]  (Sampling)
## Chain 2: Iteration: 8000 / 8000 [100%]  (Sampling)
## Chain 2: 
## Chain 2:  Elapsed Time: 0.352843 seconds (Warm-up)
## Chain 2:                0.428169 seconds (Sampling)
## Chain 2:                0.781012 seconds (Total)
## Chain 2: 
## 
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 3).
## Chain 3: 
## Chain 3: Gradient evaluation took 2.7e-05 seconds
## Chain 3: 1000 transitions using 10 leapfrog steps per transition would take 0.27 seconds.
## Chain 3: Adjust your expectations accordingly!
## Chain 3: 
## Chain 3: 
## Chain 3: Iteration:    1 / 8000 [  0%]  (Warmup)
## Chain 3: Iteration:  800 / 8000 [ 10%]  (Warmup)
## Chain 3: Iteration: 1600 / 8000 [ 20%]  (Warmup)
## Chain 3: Iteration: 2400 / 8000 [ 30%]  (Warmup)
## Chain 3: Iteration: 3200 / 8000 [ 40%]  (Warmup)
## Chain 3: Iteration: 4000 / 8000 [ 50%]  (Warmup)
## Chain 3: Iteration: 4001 / 8000 [ 50%]  (Sampling)
## Chain 3: Iteration: 4800 / 8000 [ 60%]  (Sampling)
## Chain 3: Iteration: 5600 / 8000 [ 70%]  (Sampling)
## Chain 3: Iteration: 6400 / 8000 [ 80%]  (Sampling)
## Chain 3: Iteration: 7200 / 8000 [ 90%]  (Sampling)
## Chain 3: Iteration: 8000 / 8000 [100%]  (Sampling)
## Chain 3: 
## Chain 3:  Elapsed Time: 0.350186 seconds (Warm-up)
## Chain 3:                0.474125 seconds (Sampling)
## Chain 3:                0.824311 seconds (Total)
## Chain 3: 
## 
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 4).
## Chain 4: 
## Chain 4: Gradient evaluation took 2.3e-05 seconds
## Chain 4: 1000 transitions using 10 leapfrog steps per transition would take 0.23 seconds.
## Chain 4: Adjust your expectations accordingly!
## Chain 4: 
## Chain 4: 
## Chain 4: Iteration:    1 / 8000 [  0%]  (Warmup)
## Chain 4: Iteration:  800 / 8000 [ 10%]  (Warmup)
## Chain 4: Iteration: 1600 / 8000 [ 20%]  (Warmup)
## Chain 4: Iteration: 2400 / 8000 [ 30%]  (Warmup)
## Chain 4: Iteration: 3200 / 8000 [ 40%]  (Warmup)
## Chain 4: Iteration: 4000 / 8000 [ 50%]  (Warmup)
## Chain 4: Iteration: 4001 / 8000 [ 50%]  (Sampling)
## Chain 4: Iteration: 4800 / 8000 [ 60%]  (Sampling)
## Chain 4: Iteration: 5600 / 8000 [ 70%]  (Sampling)
## Chain 4: Iteration: 6400 / 8000 [ 80%]  (Sampling)
## Chain 4: Iteration: 7200 / 8000 [ 90%]  (Sampling)
## Chain 4: Iteration: 8000 / 8000 [100%]  (Sampling)
## Chain 4: 
## Chain 4:  Elapsed Time: 0.358834 seconds (Warm-up)
## Chain 4:                0.477033 seconds (Sampling)
## Chain 4:                0.835867 seconds (Total)
## Chain 4:
penguin_model_7 <- stan_glm(
  body_mass_g ~ flipper_length_mm +bill_length_mm + bill_depth_mm,
  data = penguin_data, family = gaussian, 
  prior_intercept = normal(3100, 50),
  prior = normal(30, 2.5, autoscale = TRUE), 
  prior_aux = exponential(0.001, autoscale = TRUE),
  chains = 4, iter = 4000*2, seed = 84735, verbose=FALSE)
## 
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 1).
## Chain 1: 
## Chain 1: Gradient evaluation took 2.3e-05 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0.23 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1: 
## Chain 1: 
## Chain 1: Iteration:    1 / 8000 [  0%]  (Warmup)
## Chain 1: Iteration:  800 / 8000 [ 10%]  (Warmup)
## Chain 1: Iteration: 1600 / 8000 [ 20%]  (Warmup)
## Chain 1: Iteration: 2400 / 8000 [ 30%]  (Warmup)
## Chain 1: Iteration: 3200 / 8000 [ 40%]  (Warmup)
## Chain 1: Iteration: 4000 / 8000 [ 50%]  (Warmup)
## Chain 1: Iteration: 4001 / 8000 [ 50%]  (Sampling)
## Chain 1: Iteration: 4800 / 8000 [ 60%]  (Sampling)
## Chain 1: Iteration: 5600 / 8000 [ 70%]  (Sampling)
## Chain 1: Iteration: 6400 / 8000 [ 80%]  (Sampling)
## Chain 1: Iteration: 7200 / 8000 [ 90%]  (Sampling)
## Chain 1: Iteration: 8000 / 8000 [100%]  (Sampling)
## Chain 1: 
## Chain 1:  Elapsed Time: 0.367614 seconds (Warm-up)
## Chain 1:                0.431276 seconds (Sampling)
## Chain 1:                0.79889 seconds (Total)
## Chain 1: 
## 
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 2).
## Chain 2: 
## Chain 2: Gradient evaluation took 1.8e-05 seconds
## Chain 2: 1000 transitions using 10 leapfrog steps per transition would take 0.18 seconds.
## Chain 2: Adjust your expectations accordingly!
## Chain 2: 
## Chain 2: 
## Chain 2: Iteration:    1 / 8000 [  0%]  (Warmup)
## Chain 2: Iteration:  800 / 8000 [ 10%]  (Warmup)
## Chain 2: Iteration: 1600 / 8000 [ 20%]  (Warmup)
## Chain 2: Iteration: 2400 / 8000 [ 30%]  (Warmup)
## Chain 2: Iteration: 3200 / 8000 [ 40%]  (Warmup)
## Chain 2: Iteration: 4000 / 8000 [ 50%]  (Warmup)
## Chain 2: Iteration: 4001 / 8000 [ 50%]  (Sampling)
## Chain 2: Iteration: 4800 / 8000 [ 60%]  (Sampling)
## Chain 2: Iteration: 5600 / 8000 [ 70%]  (Sampling)
## Chain 2: Iteration: 6400 / 8000 [ 80%]  (Sampling)
## Chain 2: Iteration: 7200 / 8000 [ 90%]  (Sampling)
## Chain 2: Iteration: 8000 / 8000 [100%]  (Sampling)
## Chain 2: 
## Chain 2:  Elapsed Time: 0.362113 seconds (Warm-up)
## Chain 2:                0.472001 seconds (Sampling)
## Chain 2:                0.834114 seconds (Total)
## Chain 2: 
## 
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 3).
## Chain 3: 
## Chain 3: Gradient evaluation took 2.1e-05 seconds
## Chain 3: 1000 transitions using 10 leapfrog steps per transition would take 0.21 seconds.
## Chain 3: Adjust your expectations accordingly!
## Chain 3: 
## Chain 3: 
## Chain 3: Iteration:    1 / 8000 [  0%]  (Warmup)
## Chain 3: Iteration:  800 / 8000 [ 10%]  (Warmup)
## Chain 3: Iteration: 1600 / 8000 [ 20%]  (Warmup)
## Chain 3: Iteration: 2400 / 8000 [ 30%]  (Warmup)
## Chain 3: Iteration: 3200 / 8000 [ 40%]  (Warmup)
## Chain 3: Iteration: 4000 / 8000 [ 50%]  (Warmup)
## Chain 3: Iteration: 4001 / 8000 [ 50%]  (Sampling)
## Chain 3: Iteration: 4800 / 8000 [ 60%]  (Sampling)
## Chain 3: Iteration: 5600 / 8000 [ 70%]  (Sampling)
## Chain 3: Iteration: 6400 / 8000 [ 80%]  (Sampling)
## Chain 3: Iteration: 7200 / 8000 [ 90%]  (Sampling)
## Chain 3: Iteration: 8000 / 8000 [100%]  (Sampling)
## Chain 3: 
## Chain 3:  Elapsed Time: 0.360098 seconds (Warm-up)
## Chain 3:                0.461703 seconds (Sampling)
## Chain 3:                0.821801 seconds (Total)
## Chain 3: 
## 
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 4).
## Chain 4: 
## Chain 4: Gradient evaluation took 1.9e-05 seconds
## Chain 4: 1000 transitions using 10 leapfrog steps per transition would take 0.19 seconds.
## Chain 4: Adjust your expectations accordingly!
## Chain 4: 
## Chain 4: 
## Chain 4: Iteration:    1 / 8000 [  0%]  (Warmup)
## Chain 4: Iteration:  800 / 8000 [ 10%]  (Warmup)
## Chain 4: Iteration: 1600 / 8000 [ 20%]  (Warmup)
## Chain 4: Iteration: 2400 / 8000 [ 30%]  (Warmup)
## Chain 4: Iteration: 3200 / 8000 [ 40%]  (Warmup)
## Chain 4: Iteration: 4000 / 8000 [ 50%]  (Warmup)
## Chain 4: Iteration: 4001 / 8000 [ 50%]  (Sampling)
## Chain 4: Iteration: 4800 / 8000 [ 60%]  (Sampling)
## Chain 4: Iteration: 5600 / 8000 [ 70%]  (Sampling)
## Chain 4: Iteration: 6400 / 8000 [ 80%]  (Sampling)
## Chain 4: Iteration: 7200 / 8000 [ 90%]  (Sampling)
## Chain 4: Iteration: 8000 / 8000 [100%]  (Sampling)
## Chain 4: 
## Chain 4:  Elapsed Time: 0.393149 seconds (Warm-up)
## Chain 4:                0.493542 seconds (Sampling)
## Chain 4:                0.886691 seconds (Total)
## Chain 4:
  1. Checking the models using pp_check()
pp_check(penguin_model_4)

pp_check(penguin_model_5)

pp_check(penguin_model_6)

pp_check(penguin_model_7)

  1. Running prediction_summary_cv()
prediction_summary_cv(model = penguin_model_4, data = penguin_data, k = 10)
## $folds
##    fold      mae mae_scaled within_50 within_95
## 1     1 357.2165  0.5639881 0.5185185 0.8888889
## 2     2 460.9673  0.6774634 0.5000000 0.9615385
## 3     3 398.3068  0.6603842 0.5185185 0.9259259
## 4     4 587.2437  0.9852940 0.3076923 0.8846154
## 5     5 510.9056  0.8015057 0.4444444 1.0000000
## 6     6 424.3274  0.6659373 0.5000000 0.9615385
## 7     7 460.8302  0.6769068 0.5000000 1.0000000
## 8     8 646.7795  1.0032701 0.2592593 0.9259259
## 9     9 432.2585  0.6410494 0.5384615 1.0000000
## 10   10 464.0117  0.6713420 0.4814815 1.0000000
## 
## $cv
##        mae mae_scaled within_50 within_95
## 1 474.2847  0.7347141 0.4568376 0.9548433
prediction_summary_cv(model = penguin_model_5, data = penguin_data, k = 10)
## $folds
##    fold      mae mae_scaled within_50 within_95
## 1     1 571.9361  0.7122460 0.4814815 1.0000000
## 2     2 923.5512  1.2350226 0.2307692 0.9615385
## 3     3 686.8489  0.8683759 0.3703704 0.9629630
## 4     4 690.1843  0.9220871 0.3846154 0.8846154
## 5     5 457.4796  0.5597983 0.5925926 1.0000000
## 6     6 793.3704  0.9469641 0.3846154 0.9615385
## 7     7 530.6584  0.6196611 0.5384615 1.0000000
## 8     8 515.8149  0.6445739 0.5185185 0.9629630
## 9     9 569.8630  0.6754785 0.5000000 1.0000000
## 10   10 540.4538  0.6566891 0.5185185 1.0000000
## 
## $cv
##        mae mae_scaled within_50 within_95
## 1 628.0161  0.7840896 0.4519943 0.9733618
prediction_summary_cv(model = penguin_model_6, data = penguin_data, k = 10)
## $folds
##    fold      mae mae_scaled within_50 within_95
## 1     1 528.3690  0.8317152 0.3703704 1.0000000
## 2     2 401.8910  0.6329174 0.5000000 0.9615385
## 3     3 435.7046  0.6584721 0.5185185 0.9629630
## 4     4 535.1367  0.8440899 0.4230769 0.9615385
## 5     5 440.9778  0.6839468 0.4814815 0.9629630
## 6     6 646.0260  0.9511811 0.4230769 0.9615385
## 7     7 465.5016  0.7201086 0.4230769 0.9230769
## 8     8 418.6512  0.6450584 0.5185185 1.0000000
## 9     9 475.2119  0.7515951 0.3846154 0.9230769
## 10   10 472.2740  0.7695869 0.4074074 0.9259259
## 
## $cv
##        mae mae_scaled within_50 within_95
## 1 481.9744  0.7488672 0.4450142 0.9582621
prediction_summary_cv(model = penguin_model_7, data = penguin_data, k = 10)
## $folds
##    fold      mae mae_scaled within_50 within_95
## 1     1 399.7241  0.8801726 0.2962963 0.8888889
## 2     2 278.6517  0.5098146 0.6923077 1.0000000
## 3     3 323.5021  0.6108105 0.5185185 0.9259259
## 4     4 373.6903  0.7309572 0.4615385 0.9615385
## 5     5 501.4144  0.9351296 0.3333333 1.0000000
## 6     6 227.3234  0.4555048 0.5769231 0.9615385
## 7     7 365.6826  0.7485364 0.4615385 0.9230769
## 8     8 426.2163  0.8059264 0.3703704 1.0000000
## 9     9 389.0706  0.7205595 0.4615385 0.9615385
## 10   10 251.0481  0.4757839 0.5925926 0.9259259
## 
## $cv
##        mae mae_scaled within_50 within_95
## 1 353.6324  0.6873196 0.4764957 0.9548433
  1. Comparing ELPD for all 4 models
# Calculate ELPD for the 4 models
set.seed(84735)
loo_1 <- loo(penguin_model_4)
loo_2 <- loo(penguin_model_5)
loo_3 <- loo(penguin_model_6)
loo_4 <- loo(penguin_model_7)

# Results
c(loo_1$estimates[1], loo_2$estimates[1], 
  loo_3$estimates[1], loo_4$estimates[1])
## [1] -2053.593 -2124.003 -2052.450 -1994.006
loo_compare(loo_1, loo_2, loo_3, loo_4)
##                 elpd_diff se_diff
## penguin_model_7    0.0       0.0 
## penguin_model_6  -58.4       5.8 
## penguin_model_4  -59.6       5.6 
## penguin_model_5 -130.0       9.6
  1. The 4th model (in our code, penguin_model_7) has the lowest MAE_scaled (0.722), and the highest percentage of values within the 50% posterior predicted interval. The 4th model also has the best ELPD (the larger the ELPD, the more accurate the predictions)–we know this because loo_compare outputs the best model, and then lists all other models in comparison to the best model. Therefore, I would say the 4th model is the best.

Exercise 11.14

First, let’s look at the relationship between bill_length and bill_depth

penguin_data <- penguin_data |> drop_na()
ggplot(penguin_data, aes(x = bill_depth_mm, y = bill_length_mm)) +
 geom_point() + 
  geom_smooth(method = "lm", se = FALSE)
## `geom_smooth()` using formula 'y ~ x'

Now, let’s check out the relationship with body_mass

penguin_data <- penguin_data |> drop_na()
ggplot(penguin_data, aes(x = body_mass_g, y = bill_length_mm)) +
 geom_point() + 
  geom_smooth(method = "lm", se = FALSE)
## `geom_smooth()` using formula 'y ~ x'

Building three penguin models

penguin_model_8 <- stan_glm(
  bill_length_mm ~ bill_depth_mm,
  data = penguin_data, family = gaussian, 
  prior_intercept = normal(50, 5),
  prior = normal(-1.5, 0.005, autoscale = TRUE), 
  prior_aux = exponential(0.001, autoscale = TRUE),
  chains = 4, iter = 4000*2, seed = 84735, verbose=FALSE)
## 
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 1).
## Chain 1: 
## Chain 1: Gradient evaluation took 2.6e-05 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0.26 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1: 
## Chain 1: 
## Chain 1: Iteration:    1 / 8000 [  0%]  (Warmup)
## Chain 1: Iteration:  800 / 8000 [ 10%]  (Warmup)
## Chain 1: Iteration: 1600 / 8000 [ 20%]  (Warmup)
## Chain 1: Iteration: 2400 / 8000 [ 30%]  (Warmup)
## Chain 1: Iteration: 3200 / 8000 [ 40%]  (Warmup)
## Chain 1: Iteration: 4000 / 8000 [ 50%]  (Warmup)
## Chain 1: Iteration: 4001 / 8000 [ 50%]  (Sampling)
## Chain 1: Iteration: 4800 / 8000 [ 60%]  (Sampling)
## Chain 1: Iteration: 5600 / 8000 [ 70%]  (Sampling)
## Chain 1: Iteration: 6400 / 8000 [ 80%]  (Sampling)
## Chain 1: Iteration: 7200 / 8000 [ 90%]  (Sampling)
## Chain 1: Iteration: 8000 / 8000 [100%]  (Sampling)
## Chain 1: 
## Chain 1:  Elapsed Time: 0.170961 seconds (Warm-up)
## Chain 1:                0.261897 seconds (Sampling)
## Chain 1:                0.432858 seconds (Total)
## Chain 1: 
## 
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 2).
## Chain 2: 
## Chain 2: Gradient evaluation took 2.6e-05 seconds
## Chain 2: 1000 transitions using 10 leapfrog steps per transition would take 0.26 seconds.
## Chain 2: Adjust your expectations accordingly!
## Chain 2: 
## Chain 2: 
## Chain 2: Iteration:    1 / 8000 [  0%]  (Warmup)
## Chain 2: Iteration:  800 / 8000 [ 10%]  (Warmup)
## Chain 2: Iteration: 1600 / 8000 [ 20%]  (Warmup)
## Chain 2: Iteration: 2400 / 8000 [ 30%]  (Warmup)
## Chain 2: Iteration: 3200 / 8000 [ 40%]  (Warmup)
## Chain 2: Iteration: 4000 / 8000 [ 50%]  (Warmup)
## Chain 2: Iteration: 4001 / 8000 [ 50%]  (Sampling)
## Chain 2: Iteration: 4800 / 8000 [ 60%]  (Sampling)
## Chain 2: Iteration: 5600 / 8000 [ 70%]  (Sampling)
## Chain 2: Iteration: 6400 / 8000 [ 80%]  (Sampling)
## Chain 2: Iteration: 7200 / 8000 [ 90%]  (Sampling)
## Chain 2: Iteration: 8000 / 8000 [100%]  (Sampling)
## Chain 2: 
## Chain 2:  Elapsed Time: 0.172805 seconds (Warm-up)
## Chain 2:                0.320485 seconds (Sampling)
## Chain 2:                0.49329 seconds (Total)
## Chain 2: 
## 
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 3).
## Chain 3: 
## Chain 3: Gradient evaluation took 1.7e-05 seconds
## Chain 3: 1000 transitions using 10 leapfrog steps per transition would take 0.17 seconds.
## Chain 3: Adjust your expectations accordingly!
## Chain 3: 
## Chain 3: 
## Chain 3: Iteration:    1 / 8000 [  0%]  (Warmup)
## Chain 3: Iteration:  800 / 8000 [ 10%]  (Warmup)
## Chain 3: Iteration: 1600 / 8000 [ 20%]  (Warmup)
## Chain 3: Iteration: 2400 / 8000 [ 30%]  (Warmup)
## Chain 3: Iteration: 3200 / 8000 [ 40%]  (Warmup)
## Chain 3: Iteration: 4000 / 8000 [ 50%]  (Warmup)
## Chain 3: Iteration: 4001 / 8000 [ 50%]  (Sampling)
## Chain 3: Iteration: 4800 / 8000 [ 60%]  (Sampling)
## Chain 3: Iteration: 5600 / 8000 [ 70%]  (Sampling)
## Chain 3: Iteration: 6400 / 8000 [ 80%]  (Sampling)
## Chain 3: Iteration: 7200 / 8000 [ 90%]  (Sampling)
## Chain 3: Iteration: 8000 / 8000 [100%]  (Sampling)
## Chain 3: 
## Chain 3:  Elapsed Time: 0.155847 seconds (Warm-up)
## Chain 3:                0.270591 seconds (Sampling)
## Chain 3:                0.426438 seconds (Total)
## Chain 3: 
## 
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 4).
## Chain 4: 
## Chain 4: Gradient evaluation took 2.2e-05 seconds
## Chain 4: 1000 transitions using 10 leapfrog steps per transition would take 0.22 seconds.
## Chain 4: Adjust your expectations accordingly!
## Chain 4: 
## Chain 4: 
## Chain 4: Iteration:    1 / 8000 [  0%]  (Warmup)
## Chain 4: Iteration:  800 / 8000 [ 10%]  (Warmup)
## Chain 4: Iteration: 1600 / 8000 [ 20%]  (Warmup)
## Chain 4: Iteration: 2400 / 8000 [ 30%]  (Warmup)
## Chain 4: Iteration: 3200 / 8000 [ 40%]  (Warmup)
## Chain 4: Iteration: 4000 / 8000 [ 50%]  (Warmup)
## Chain 4: Iteration: 4001 / 8000 [ 50%]  (Sampling)
## Chain 4: Iteration: 4800 / 8000 [ 60%]  (Sampling)
## Chain 4: Iteration: 5600 / 8000 [ 70%]  (Sampling)
## Chain 4: Iteration: 6400 / 8000 [ 80%]  (Sampling)
## Chain 4: Iteration: 7200 / 8000 [ 90%]  (Sampling)
## Chain 4: Iteration: 8000 / 8000 [100%]  (Sampling)
## Chain 4: 
## Chain 4:  Elapsed Time: 0.152423 seconds (Warm-up)
## Chain 4:                0.272881 seconds (Sampling)
## Chain 4:                0.425304 seconds (Total)
## Chain 4:
penguin_model_9 <- stan_glm(
  bill_length_mm ~ body_mass_g,
  data = penguin_data, family = gaussian, 
  prior_intercept = normal(35, 5),
  prior = normal(0.005, 0.005, autoscale = TRUE), 
  prior_aux = exponential(0.001, autoscale = TRUE),
  chains = 4, iter = 4000*2, seed = 84735, verbose=FALSE)
## 
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 1).
## Chain 1: 
## Chain 1: Gradient evaluation took 2.3e-05 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0.23 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1: 
## Chain 1: 
## Chain 1: Iteration:    1 / 8000 [  0%]  (Warmup)
## Chain 1: Iteration:  800 / 8000 [ 10%]  (Warmup)
## Chain 1: Iteration: 1600 / 8000 [ 20%]  (Warmup)
## Chain 1: Iteration: 2400 / 8000 [ 30%]  (Warmup)
## Chain 1: Iteration: 3200 / 8000 [ 40%]  (Warmup)
## Chain 1: Iteration: 4000 / 8000 [ 50%]  (Warmup)
## Chain 1: Iteration: 4001 / 8000 [ 50%]  (Sampling)
## Chain 1: Iteration: 4800 / 8000 [ 60%]  (Sampling)
## Chain 1: Iteration: 5600 / 8000 [ 70%]  (Sampling)
## Chain 1: Iteration: 6400 / 8000 [ 80%]  (Sampling)
## Chain 1: Iteration: 7200 / 8000 [ 90%]  (Sampling)
## Chain 1: Iteration: 8000 / 8000 [100%]  (Sampling)
## Chain 1: 
## Chain 1:  Elapsed Time: 0.157426 seconds (Warm-up)
## Chain 1:                0.27263 seconds (Sampling)
## Chain 1:                0.430056 seconds (Total)
## Chain 1: 
## 
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 2).
## Chain 2: 
## Chain 2: Gradient evaluation took 2.3e-05 seconds
## Chain 2: 1000 transitions using 10 leapfrog steps per transition would take 0.23 seconds.
## Chain 2: Adjust your expectations accordingly!
## Chain 2: 
## Chain 2: 
## Chain 2: Iteration:    1 / 8000 [  0%]  (Warmup)
## Chain 2: Iteration:  800 / 8000 [ 10%]  (Warmup)
## Chain 2: Iteration: 1600 / 8000 [ 20%]  (Warmup)
## Chain 2: Iteration: 2400 / 8000 [ 30%]  (Warmup)
## Chain 2: Iteration: 3200 / 8000 [ 40%]  (Warmup)
## Chain 2: Iteration: 4000 / 8000 [ 50%]  (Warmup)
## Chain 2: Iteration: 4001 / 8000 [ 50%]  (Sampling)
## Chain 2: Iteration: 4800 / 8000 [ 60%]  (Sampling)
## Chain 2: Iteration: 5600 / 8000 [ 70%]  (Sampling)
## Chain 2: Iteration: 6400 / 8000 [ 80%]  (Sampling)
## Chain 2: Iteration: 7200 / 8000 [ 90%]  (Sampling)
## Chain 2: Iteration: 8000 / 8000 [100%]  (Sampling)
## Chain 2: 
## Chain 2:  Elapsed Time: 0.150814 seconds (Warm-up)
## Chain 2:                0.269698 seconds (Sampling)
## Chain 2:                0.420512 seconds (Total)
## Chain 2: 
## 
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 3).
## Chain 3: 
## Chain 3: Gradient evaluation took 2.1e-05 seconds
## Chain 3: 1000 transitions using 10 leapfrog steps per transition would take 0.21 seconds.
## Chain 3: Adjust your expectations accordingly!
## Chain 3: 
## Chain 3: 
## Chain 3: Iteration:    1 / 8000 [  0%]  (Warmup)
## Chain 3: Iteration:  800 / 8000 [ 10%]  (Warmup)
## Chain 3: Iteration: 1600 / 8000 [ 20%]  (Warmup)
## Chain 3: Iteration: 2400 / 8000 [ 30%]  (Warmup)
## Chain 3: Iteration: 3200 / 8000 [ 40%]  (Warmup)
## Chain 3: Iteration: 4000 / 8000 [ 50%]  (Warmup)
## Chain 3: Iteration: 4001 / 8000 [ 50%]  (Sampling)
## Chain 3: Iteration: 4800 / 8000 [ 60%]  (Sampling)
## Chain 3: Iteration: 5600 / 8000 [ 70%]  (Sampling)
## Chain 3: Iteration: 6400 / 8000 [ 80%]  (Sampling)
## Chain 3: Iteration: 7200 / 8000 [ 90%]  (Sampling)
## Chain 3: Iteration: 8000 / 8000 [100%]  (Sampling)
## Chain 3: 
## Chain 3:  Elapsed Time: 0.159981 seconds (Warm-up)
## Chain 3:                0.267097 seconds (Sampling)
## Chain 3:                0.427078 seconds (Total)
## Chain 3: 
## 
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 4).
## Chain 4: 
## Chain 4: Gradient evaluation took 1.8e-05 seconds
## Chain 4: 1000 transitions using 10 leapfrog steps per transition would take 0.18 seconds.
## Chain 4: Adjust your expectations accordingly!
## Chain 4: 
## Chain 4: 
## Chain 4: Iteration:    1 / 8000 [  0%]  (Warmup)
## Chain 4: Iteration:  800 / 8000 [ 10%]  (Warmup)
## Chain 4: Iteration: 1600 / 8000 [ 20%]  (Warmup)
## Chain 4: Iteration: 2400 / 8000 [ 30%]  (Warmup)
## Chain 4: Iteration: 3200 / 8000 [ 40%]  (Warmup)
## Chain 4: Iteration: 4000 / 8000 [ 50%]  (Warmup)
## Chain 4: Iteration: 4001 / 8000 [ 50%]  (Sampling)
## Chain 4: Iteration: 4800 / 8000 [ 60%]  (Sampling)
## Chain 4: Iteration: 5600 / 8000 [ 70%]  (Sampling)
## Chain 4: Iteration: 6400 / 8000 [ 80%]  (Sampling)
## Chain 4: Iteration: 7200 / 8000 [ 90%]  (Sampling)
## Chain 4: Iteration: 8000 / 8000 [100%]  (Sampling)
## Chain 4: 
## Chain 4:  Elapsed Time: 0.157404 seconds (Warm-up)
## Chain 4:                0.272987 seconds (Sampling)
## Chain 4:                0.430391 seconds (Total)
## Chain 4:
penguin_model_10 <- stan_glm(
  bill_length_mm ~ bill_depth_mm+body_mass_g,
  data = penguin_data, family = gaussian, 
  prior_intercept = normal(50, 5),
  prior = normal(0, 0.005, autoscale = TRUE), 
  prior_aux = exponential(0.001, autoscale = TRUE),
  chains = 4, iter = 4000*2, seed = 84735, verbose=FALSE)
## 
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 1).
## Chain 1: 
## Chain 1: Gradient evaluation took 2.1e-05 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0.21 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1: 
## Chain 1: 
## Chain 1: Iteration:    1 / 8000 [  0%]  (Warmup)
## Chain 1: Iteration:  800 / 8000 [ 10%]  (Warmup)
## Chain 1: Iteration: 1600 / 8000 [ 20%]  (Warmup)
## Chain 1: Iteration: 2400 / 8000 [ 30%]  (Warmup)
## Chain 1: Iteration: 3200 / 8000 [ 40%]  (Warmup)
## Chain 1: Iteration: 4000 / 8000 [ 50%]  (Warmup)
## Chain 1: Iteration: 4001 / 8000 [ 50%]  (Sampling)
## Chain 1: Iteration: 4800 / 8000 [ 60%]  (Sampling)
## Chain 1: Iteration: 5600 / 8000 [ 70%]  (Sampling)
## Chain 1: Iteration: 6400 / 8000 [ 80%]  (Sampling)
## Chain 1: Iteration: 7200 / 8000 [ 90%]  (Sampling)
## Chain 1: Iteration: 8000 / 8000 [100%]  (Sampling)
## Chain 1: 
## Chain 1:  Elapsed Time: 0.184738 seconds (Warm-up)
## Chain 1:                0.282261 seconds (Sampling)
## Chain 1:                0.466999 seconds (Total)
## Chain 1: 
## 
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 2).
## Chain 2: 
## Chain 2: Gradient evaluation took 3.7e-05 seconds
## Chain 2: 1000 transitions using 10 leapfrog steps per transition would take 0.37 seconds.
## Chain 2: Adjust your expectations accordingly!
## Chain 2: 
## Chain 2: 
## Chain 2: Iteration:    1 / 8000 [  0%]  (Warmup)
## Chain 2: Iteration:  800 / 8000 [ 10%]  (Warmup)
## Chain 2: Iteration: 1600 / 8000 [ 20%]  (Warmup)
## Chain 2: Iteration: 2400 / 8000 [ 30%]  (Warmup)
## Chain 2: Iteration: 3200 / 8000 [ 40%]  (Warmup)
## Chain 2: Iteration: 4000 / 8000 [ 50%]  (Warmup)
## Chain 2: Iteration: 4001 / 8000 [ 50%]  (Sampling)
## Chain 2: Iteration: 4800 / 8000 [ 60%]  (Sampling)
## Chain 2: Iteration: 5600 / 8000 [ 70%]  (Sampling)
## Chain 2: Iteration: 6400 / 8000 [ 80%]  (Sampling)
## Chain 2: Iteration: 7200 / 8000 [ 90%]  (Sampling)
## Chain 2: Iteration: 8000 / 8000 [100%]  (Sampling)
## Chain 2: 
## Chain 2:  Elapsed Time: 0.181171 seconds (Warm-up)
## Chain 2:                0.282274 seconds (Sampling)
## Chain 2:                0.463445 seconds (Total)
## Chain 2: 
## 
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 3).
## Chain 3: 
## Chain 3: Gradient evaluation took 1.9e-05 seconds
## Chain 3: 1000 transitions using 10 leapfrog steps per transition would take 0.19 seconds.
## Chain 3: Adjust your expectations accordingly!
## Chain 3: 
## Chain 3: 
## Chain 3: Iteration:    1 / 8000 [  0%]  (Warmup)
## Chain 3: Iteration:  800 / 8000 [ 10%]  (Warmup)
## Chain 3: Iteration: 1600 / 8000 [ 20%]  (Warmup)
## Chain 3: Iteration: 2400 / 8000 [ 30%]  (Warmup)
## Chain 3: Iteration: 3200 / 8000 [ 40%]  (Warmup)
## Chain 3: Iteration: 4000 / 8000 [ 50%]  (Warmup)
## Chain 3: Iteration: 4001 / 8000 [ 50%]  (Sampling)
## Chain 3: Iteration: 4800 / 8000 [ 60%]  (Sampling)
## Chain 3: Iteration: 5600 / 8000 [ 70%]  (Sampling)
## Chain 3: Iteration: 6400 / 8000 [ 80%]  (Sampling)
## Chain 3: Iteration: 7200 / 8000 [ 90%]  (Sampling)
## Chain 3: Iteration: 8000 / 8000 [100%]  (Sampling)
## Chain 3: 
## Chain 3:  Elapsed Time: 0.177909 seconds (Warm-up)
## Chain 3:                0.279587 seconds (Sampling)
## Chain 3:                0.457496 seconds (Total)
## Chain 3: 
## 
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 4).
## Chain 4: 
## Chain 4: Gradient evaluation took 1.8e-05 seconds
## Chain 4: 1000 transitions using 10 leapfrog steps per transition would take 0.18 seconds.
## Chain 4: Adjust your expectations accordingly!
## Chain 4: 
## Chain 4: 
## Chain 4: Iteration:    1 / 8000 [  0%]  (Warmup)
## Chain 4: Iteration:  800 / 8000 [ 10%]  (Warmup)
## Chain 4: Iteration: 1600 / 8000 [ 20%]  (Warmup)
## Chain 4: Iteration: 2400 / 8000 [ 30%]  (Warmup)
## Chain 4: Iteration: 3200 / 8000 [ 40%]  (Warmup)
## Chain 4: Iteration: 4000 / 8000 [ 50%]  (Warmup)
## Chain 4: Iteration: 4001 / 8000 [ 50%]  (Sampling)
## Chain 4: Iteration: 4800 / 8000 [ 60%]  (Sampling)
## Chain 4: Iteration: 5600 / 8000 [ 70%]  (Sampling)
## Chain 4: Iteration: 6400 / 8000 [ 80%]  (Sampling)
## Chain 4: Iteration: 7200 / 8000 [ 90%]  (Sampling)
## Chain 4: Iteration: 8000 / 8000 [100%]  (Sampling)
## Chain 4: 
## Chain 4:  Elapsed Time: 0.18696 seconds (Warm-up)
## Chain 4:                0.269333 seconds (Sampling)
## Chain 4:                0.456293 seconds (Total)
## Chain 4:
  1. Evaluating models
pp_check(penguin_model_8)

pp_check(penguin_model_9)

pp_check(penguin_model_10)

prediction_summary_cv(model = penguin_model_8, data = penguin_data, k = 10)
## $folds
##    fold      mae mae_scaled within_50 within_95
## 1     1 2.261209  0.5042860 0.7037037 0.9259259
## 2     2 3.315736  0.7566139 0.4615385 0.9230769
## 3     3 2.326412  0.5135774 0.6296296 0.9629630
## 4     4 2.130073  0.4711664 0.6923077 1.0000000
## 5     5 2.649268  0.5910488 0.5925926 0.9629630
## 6     6 4.275919  0.9758033 0.3076923 0.9615385
## 7     7 3.082429  0.6899118 0.5000000 0.9615385
## 8     8 3.273684  0.7224448 0.4444444 1.0000000
## 9     9 2.805641  0.6343910 0.5384615 0.9615385
## 10   10 4.601239  1.0966927 0.3703704 0.8888889
## 
## $cv
##        mae mae_scaled within_50 within_95
## 1 3.072161  0.6955936 0.5240741 0.9548433
prediction_summary_cv(model = penguin_model_9, data = penguin_data, k = 10)
## $folds
##    fold      mae mae_scaled within_50 within_95
## 1     1 1.685671  0.6685488 0.5185185 0.9629630
## 2     2 1.234169  0.4742710 0.6923077 1.0000000
## 3     3 1.886409  0.7328654 0.4444444 0.9629630
## 4     4 1.444000  0.5696574 0.6153846 0.9615385
## 5     5 1.729281  0.6861726 0.4814815 0.8888889
## 6     6 1.554756  0.6106289 0.5384615 0.9615385
## 7     7 1.210328  0.4717706 0.6153846 0.9615385
## 8     8 2.184668  0.8537932 0.4444444 1.0000000
## 9     9 2.037541  0.7879526 0.3461538 1.0000000
## 10   10 1.723806  0.6865360 0.4444444 0.9629630
## 
## $cv
##        mae mae_scaled within_50 within_95
## 1 1.669063  0.6542197 0.5141026 0.9662393
prediction_summary_cv(model = penguin_model_10, data = penguin_data, k = 10)
## $folds
##    fold      mae mae_scaled within_50 within_95
## 1     1 4.167887  0.7877940 0.4444444 1.0000000
## 2     2 3.566254  0.6726162 0.5000000 1.0000000
## 3     3 5.620458  1.0802408 0.3333333 0.9259259
## 4     4 4.081782  0.7712352 0.4615385 1.0000000
## 5     5 4.318533  0.8194117 0.3703704 1.0000000
## 6     6 3.639383  0.6973355 0.5000000 0.9615385
## 7     7 4.918917  0.9788260 0.2307692 0.9615385
## 8     8 3.594653  0.6806221 0.4814815 1.0000000
## 9     9 4.047367  0.7763710 0.3846154 1.0000000
## 10   10 3.939797  0.7596770 0.4444444 0.9259259
## 
## $cv
##        mae mae_scaled within_50 within_95
## 1 4.189503   0.802413 0.4150997 0.9774929

As we can see from both the pp_check plots and the error statistics, the third model is significantly worse than the first two, likely due to the fact that the priors for the first two models were inferred from the data, while for the last model the prior was not able to be inferred from the data. Of the first two models, I prefer the second, for the posterior predictive distribution better matches the observed distribution, and the error statistics are smaller.

Exercise 11.15

Downloading weather data

data("weather_perth")
# Alternative penguin data
weather_data <- weather_perth |> drop_na()

First, check out the data

ggplot(weather_data, aes(x = windspeed9am, y = windspeed3pm)) +
 geom_point() + 
  geom_smooth(method = "lm", se = FALSE)
## `geom_smooth()` using formula 'y ~ x'

Building weather model

weather_model <- stan_glm(
  windspeed3pm ~ windspeed9am,
  data = weather_data, family = gaussian, 
  prior_intercept = normal(10, 2),
  prior = normal(0.4, 0.005, autoscale = TRUE), 
  prior_aux = exponential(0.001, autoscale = TRUE),
  chains = 4, iter = 4000*2, seed = 84735, verbose=FALSE)
## 
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 1).
## Chain 1: 
## Chain 1: Gradient evaluation took 2.5e-05 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0.25 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1: 
## Chain 1: 
## Chain 1: Iteration:    1 / 8000 [  0%]  (Warmup)
## Chain 1: Iteration:  800 / 8000 [ 10%]  (Warmup)
## Chain 1: Iteration: 1600 / 8000 [ 20%]  (Warmup)
## Chain 1: Iteration: 2400 / 8000 [ 30%]  (Warmup)
## Chain 1: Iteration: 3200 / 8000 [ 40%]  (Warmup)
## Chain 1: Iteration: 4000 / 8000 [ 50%]  (Warmup)
## Chain 1: Iteration: 4001 / 8000 [ 50%]  (Sampling)
## Chain 1: Iteration: 4800 / 8000 [ 60%]  (Sampling)
## Chain 1: Iteration: 5600 / 8000 [ 70%]  (Sampling)
## Chain 1: Iteration: 6400 / 8000 [ 80%]  (Sampling)
## Chain 1: Iteration: 7200 / 8000 [ 90%]  (Sampling)
## Chain 1: Iteration: 8000 / 8000 [100%]  (Sampling)
## Chain 1: 
## Chain 1:  Elapsed Time: 0.170268 seconds (Warm-up)
## Chain 1:                0.587048 seconds (Sampling)
## Chain 1:                0.757316 seconds (Total)
## Chain 1: 
## 
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 2).
## Chain 2: 
## Chain 2: Gradient evaluation took 2.6e-05 seconds
## Chain 2: 1000 transitions using 10 leapfrog steps per transition would take 0.26 seconds.
## Chain 2: Adjust your expectations accordingly!
## Chain 2: 
## Chain 2: 
## Chain 2: Iteration:    1 / 8000 [  0%]  (Warmup)
## Chain 2: Iteration:  800 / 8000 [ 10%]  (Warmup)
## Chain 2: Iteration: 1600 / 8000 [ 20%]  (Warmup)
## Chain 2: Iteration: 2400 / 8000 [ 30%]  (Warmup)
## Chain 2: Iteration: 3200 / 8000 [ 40%]  (Warmup)
## Chain 2: Iteration: 4000 / 8000 [ 50%]  (Warmup)
## Chain 2: Iteration: 4001 / 8000 [ 50%]  (Sampling)
## Chain 2: Iteration: 4800 / 8000 [ 60%]  (Sampling)
## Chain 2: Iteration: 5600 / 8000 [ 70%]  (Sampling)
## Chain 2: Iteration: 6400 / 8000 [ 80%]  (Sampling)
## Chain 2: Iteration: 7200 / 8000 [ 90%]  (Sampling)
## Chain 2: Iteration: 8000 / 8000 [100%]  (Sampling)
## Chain 2: 
## Chain 2:  Elapsed Time: 0.177233 seconds (Warm-up)
## Chain 2:                0.581812 seconds (Sampling)
## Chain 2:                0.759045 seconds (Total)
## Chain 2: 
## 
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 3).
## Chain 3: 
## Chain 3: Gradient evaluation took 2.3e-05 seconds
## Chain 3: 1000 transitions using 10 leapfrog steps per transition would take 0.23 seconds.
## Chain 3: Adjust your expectations accordingly!
## Chain 3: 
## Chain 3: 
## Chain 3: Iteration:    1 / 8000 [  0%]  (Warmup)
## Chain 3: Iteration:  800 / 8000 [ 10%]  (Warmup)
## Chain 3: Iteration: 1600 / 8000 [ 20%]  (Warmup)
## Chain 3: Iteration: 2400 / 8000 [ 30%]  (Warmup)
## Chain 3: Iteration: 3200 / 8000 [ 40%]  (Warmup)
## Chain 3: Iteration: 4000 / 8000 [ 50%]  (Warmup)
## Chain 3: Iteration: 4001 / 8000 [ 50%]  (Sampling)
## Chain 3: Iteration: 4800 / 8000 [ 60%]  (Sampling)
## Chain 3: Iteration: 5600 / 8000 [ 70%]  (Sampling)
## Chain 3: Iteration: 6400 / 8000 [ 80%]  (Sampling)
## Chain 3: Iteration: 7200 / 8000 [ 90%]  (Sampling)
## Chain 3: Iteration: 8000 / 8000 [100%]  (Sampling)
## Chain 3: 
## Chain 3:  Elapsed Time: 0.180333 seconds (Warm-up)
## Chain 3:                0.608701 seconds (Sampling)
## Chain 3:                0.789034 seconds (Total)
## Chain 3: 
## 
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 4).
## Chain 4: 
## Chain 4: Gradient evaluation took 2.4e-05 seconds
## Chain 4: 1000 transitions using 10 leapfrog steps per transition would take 0.24 seconds.
## Chain 4: Adjust your expectations accordingly!
## Chain 4: 
## Chain 4: 
## Chain 4: Iteration:    1 / 8000 [  0%]  (Warmup)
## Chain 4: Iteration:  800 / 8000 [ 10%]  (Warmup)
## Chain 4: Iteration: 1600 / 8000 [ 20%]  (Warmup)
## Chain 4: Iteration: 2400 / 8000 [ 30%]  (Warmup)
## Chain 4: Iteration: 3200 / 8000 [ 40%]  (Warmup)
## Chain 4: Iteration: 4000 / 8000 [ 50%]  (Warmup)
## Chain 4: Iteration: 4001 / 8000 [ 50%]  (Sampling)
## Chain 4: Iteration: 4800 / 8000 [ 60%]  (Sampling)
## Chain 4: Iteration: 5600 / 8000 [ 70%]  (Sampling)
## Chain 4: Iteration: 6400 / 8000 [ 80%]  (Sampling)
## Chain 4: Iteration: 7200 / 8000 [ 90%]  (Sampling)
## Chain 4: Iteration: 8000 / 8000 [100%]  (Sampling)
## Chain 4: 
## Chain 4:  Elapsed Time: 0.175595 seconds (Warm-up)
## Chain 4:                0.583651 seconds (Sampling)
## Chain 4:                0.759246 seconds (Total)
## Chain 4:
  1. Evaluating model
pp_check(weather_model)

prediction_summary_cv(model = weather_model, data = weather_data, k = 10)
## $folds
##    fold      mae mae_scaled within_50 within_95
## 1     1 3.238853  0.6902631      0.49      0.93
## 2     2 3.951084  0.8368496      0.42      0.98
## 3     3 3.277449  0.6917662      0.49      0.95
## 4     4 3.402771  0.7201394      0.48      0.95
## 5     5 3.484423  0.7303913      0.43      0.97
## 6     6 4.065505  0.8732405      0.36      0.95
## 7     7 3.849420  0.8268458      0.43      0.93
## 8     8 3.524971  0.7341515      0.46      0.98
## 9     9 3.373769  0.7124051      0.48      0.95
## 10   10 2.701749  0.5648839      0.53      0.97
## 
## $cv
##        mae mae_scaled within_50 within_95
## 1 3.486999  0.7380936     0.457     0.956

The pp_check illustrates that the posterior predictive disribution pretty closely maps onto the observed distribution. The error statistics confirm this: the MAE scaled is 0.736 and around 46% of data values are within the 50% posterior predictive interval. Therefore, I’d say the model is pretty good, with the caveat that the prior values were inferred from the data, so it is at risk of overfitting.